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The Stochastic Green Function (SGF) algorithm is able to simulate any Hamiltonian that does 
not suffer from the so-called "sign problem". We propose a new global space-time update scheme 
for the SGF algorithm which greatly reduces auto-correlation times relative to previous updating 
schemes. Using as a concrete example the Bose-Hubbard model and the complex Hamiltonian with 
six-site ring-exchange interactions which was recently studied in arXiv:1206.2566 i^l, we present a 
comprehensive review of the SGF algorithm and the new updating scheme. We also discuss an 
optimized implementation of the updating scheme which allows for access to large system sizes. 
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I. INTRODUCTION 

Monte Carlo methods were first introduced more than 
60 years ago [l[ to solve various classical problems. The 
development of quantum Monte Carlo (QMC) 0-0] al- 
lowed the extension of those methods to quantum sys- 
tems. For fermions, the power of QMC methods remains 
limited by the so-called sign problem, which prevents the 
treatment of systems with large sizes. While the Determi- 
nant Quantum Monte Carlo algorithm Q allows to treat 
fermions exactly in some particular cases, it is often nec- 
essary to make use of approximate methods such as the 
fixed- node approximation (Bj, and the dynamical cluster 
approximation 0] • On the other hand there arc many 
interesting bosonic systems which do not suffer from the 
sign problem, and which can be described efficiently us- 
ing a worldline representation Q. Consequently, over 
the last two decades, there have been tremendous ad- 
vances in boson QMC methods, such as the development 
of the Stochastic Series Expansion (SSE) algorithm 
the loop algorithm [l^ which makes use of global up- 
dates in the grand-canonical ensemble, or the Reptation 
Quantum Monte Carlo algorithm In recent years 

the canonical worm (CW) algorithm [l2j, which works 
in the canonical ensemble with global updates, was pro- 
posed. The method was generalized later to a wider class 
of Hamiltonians, leading to the Stochastic Green Func- 
tion (SGF) algorithrn [l3i. and improved with a new type 
of directed updates |14{ . 

Recently, the generality of the SGF algorithm made it 
possible to perform exact studies of complex systems, 
such as multispecies systems with interspecies conver- 
sions [Tsl - figj . systems with fully connected graphs [20| . 
and systems described by Hamiltonians with six-site cou- 
pling terms [21 1. 

In this paper we review the theory of the SGF algo- 
rithm and propose a new global space-time update with 
highly reduced auto-correlation times. The paper is orga- 
nized as follows: In section II we describe the properties 
and the framework of the SGF algorithm. In section III 
we present the new global space-time update scheme and 
derive the expression of all associated probabilities that 



satisfy detailed balance. We detail the differences be- 
tween this new update and the updates that are used in 
the SSE and CW algorithms. Section IV gives full details 
on measurements. In particular we describe how non- 
trivial quantities can be measured, such as the specific 
heat, the dynamical structure factor, the entropy, or n- 
point Green functions. We give in section V a simple ex- 
tension that allows the algorithm to simulate exactly the 
grand-canonical ensemble. We propose in Section VI an 
efficient implementation of the algorithm. We illustrate 
in section VII the exactness of the algorithm by making 
comparisons between the SGF algorithm and exact di- 
agonalizations, and show that the new global space-time 
update leads to smaller auto-correlation times. Finally 
we conclude in section VIII. 



II. GENERALITIES 

A. Properties 

The SGF algorithm is caracterizcd by the following 
properties: 

1. It can be applied to any sign-problem- free Hamil- 
tonian. 

2. It is completely independent of the structure of 
the Hamiltonian, no particular decomposition is re- 
quired. 

3. As a corollary, it is possible to write a single com- 
puter code that can simulate any sign-problem-free 
Hamiltonian. 

4. The acceptance rate of every update is 100%. The 
benefit of that is efficiency, as no cpu time is wasted 
with useless rejected updates and implementation 
simplicity because the changes made in the config- 
uration during an update do not need to be stored. 

5. It can simulate both the canonical and the grand- 
canonical ensembles. 
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6. It makes use of a global space-time update (see sec- 
tion , which reduces the auto-correlation time 
(see section WU^ . 

7. It allows for the measurement of high-order corre- 
lation functions, such as n-point Green functions. 

8. It works in continuous imaginary time and is exact 
(no errors beyond statistical errors). 

9. It ensures the ergodicity. In particular, the winding 
is sampled. 

The above properties 1 — 7 illustrate the differences be- 
tween the SGF algorithm and other existing algorithms. 
In particular, while other algorithms can treat only a spe- 
cific class of Hamiltonians (usually Hamiltonians that can 
be written as a sum of two-site coupling terms), the SGF 
algorithm can be directly applied to any sign-problem- 
free Hamiltonian "as is" . 

In order to give a concrete illustration of the above 
abilities of the SGF algorithm, we consider the follow- 
ing two Hamiltonians: The first Hamiltonian is the one- 
dimensional Bosc-Hubbard model, which is familiar: 



a]a, + H.c. 



1) 



(1) 



The operators al and are the creation and annihilation 
operators of a boson on site i, which satisfy the usual 
boson commutation rules, and fii ~ aja^ is the number 
operator on site i. The sum over all pairs of 

first- neighboring sites i and j. The second Hamitlonian 
describes soft-core bosons on a two-dimensional kagome 
lattice interacting via an onsitc repulsion potential and 
a sextic ring-exchange term. The Hamiltonian takes the 
form: 



n = -tJ2{ala^+H.c.) 



Ol"03"'05"06"'04"'02 



H.C. 



o 



niifii - 1) 



The sum J2o '-'^'^'^ hexagons of the lattice. The in- 
dices Ol — o6 in the sextic term denote the sites of a given 
hexagon O. This Hamiltonian has been studied recently 
in reference [2l| in the hard-core case. Because of the 
sextic term which couples six sites at a time, simulating 
this Hamiltonian with usual methods is cumbersome. We 
demonstrate below that it is straightforward to simulate 
it with the SGF algorithm. 



B. The (extended) partition function and the 
Green operator 

Like many other QMC algorithms, the SGF algorithm 
operates directly on physical states. An occupation num- 



ber basis is chosen, B = {| ■'/')}; where \ip) is a config- 
uration in the occupation number representation. We 
consider a Hamiltonian written in the form 



n = v-r, 



(3) 



where V is diagonal in the basis B, and T is off-diagonal 
and assumed to have positive matrix elements. The SGF 
algorithm does not require any further assumptions on 
the Hamiltonian. Defining the inverse temperature /?, 
the purpose of the algorithm is to sample the partition 
function 



Z{f3) = Tr e-^'^. 



(4) 



To this end, we define the Green operator Q by its matrix 
elements for all pairs of states |L) and 



{L\g\R) = g{n), 



(5) 



where n is the number of creations and annihilations 
needed to transform the state into the state |L), 
and g is an arbitrary function (see section HI), with the 
constraint g{Q) = 1, that is all diagonal matrix elements 
of Q are equal to 1. In the following, n will be referred to 
as the offset of the Green operator. By breaking up the 
exponential in ^ at imaginary time r and introducing 
the Green operator between the two parts, we can define 
an extended partition function: 



Z(/3,T) = Tre 



(6) 



Defining r(r) = e^'^-fe'^^ and Gir) = e'^^Qe-^^, 
and using the equality 

where Tr' denotes the time-ordering operator over the 
variable r', the extended partition function takes the 
form: 

Z(/3, r) = Tr e-^^T,, \eJ' nr')dr' g^^^^f,^ t(r')<ir'l 

By expanding the exponentials in ([8|) and ordering the 
operators in imaginary time, we get: 



(2) Z(/3,r)==Tre-'^*^ 



TiT„)---nrL,mrL,)TiTL)gir) 

0<Ti<---<r„</3 



X f{TR)f{TR,)T{TR,)- ■■T{Tl)dTl ■■■dTn (9) 

Note that we have used the labels L, Li, L2., ... (resp. 
R, i?2, ■•■) for the first, second, third, ... time indices 
that appear on the left (resp. right) of Q{t). By introduc- 
ing n complete sets of states with k ~ 1- ■ - n 

between each T and Q operators, and an extra set with 
fc = for the trace, the extended partition function takes 
the final form 

X {Li\f{TL)\L){L\g{T)\R){R\f{TR)\Ri) 



X •••(i|r(Ti)|o)dTi 



■ dT„, 



(10) 



where we have used the notation Vk (here with k = 0) 
to denote the matrix element (fc|V|fc). As a result, a 
configuration of the extended partition function (fTU|) is 
determined by a set of n time indices and n + 1 states 
Since the states \k) are connected to each other by single 

T operators, it is actually simpler to generate the set of 
n+1 states starting from the two states and |_R^ and 

a set of n particular terms 71- of the operator T. In the 
following, a configuration for which |l) = will be 
referred to as a diagonal configuration, and a particular 
term Tfc will be called term with space index k. 

In our example (|2]), the V and T operators are identi- 
fied as 

V = ;7^n,(n,-1), (11) 

i 

f = tJ2{ala^+H.c.) 

+ kY1 («Oia03«05«06«04a02 + H.c.) , (12) 
O 

and a term Tq can be either a kinetic or a ring-exchange 
term. Fig. [1] shows a possible configuration for the ex- 
tented partition function associated with this example. 

The extended partition function Z{I3, r) is a sum of 
diagonal configurations that belongs to the actual par- 
tition function 2^(/3), and non-diagonal configurations. 
More precisely, 

Z(/?,t) =Z(/?) 

+^Tr e-^^-^^*\L){L\g\R){R\e-^^. (13) 

The purpose of the algorithm is to evolve from a diago- 
nal configuration to another, via non-diagonal configura- 
tions. The role of the Green operator Q is to allow the 
transition from a configuration to another one by prop- 
agating across the operator string and letting the offset 
fluctuate. By satisfying detailed balance (See section|TTT|, 
the configurations of the extended partition function (jlOp 
can be generated with an extended Boltzmann weight. 
Then all quantities of interest can be estimated using 
those configurations (See section HV|) . 

III. GLOBAL SPACE-TIME UPDATE AND 
DETAILED BALANCE 

In previous formulations of the SGF algorithm [l3l.[l3|. 
the updating procedure was affecting only space indices 
and a maximum of two time indices of the operators of 
the extended partition function pO|) . In this section, 
we propose a new directed update procedure, the "global 
space-time update" , that is able to globally update both 
space and time indices of the operators of the extended 
partition function. This represents a new improvement, 
not only over past versions of the SGF algorithm, but 
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FIG. 1. (Color online) A possible configuration of the ex- 
tended partition function Z{(3,t) for the Hamitlonian ((2)1. 
The kagome lattice is represented in black with its primary 
directions a\ and 0,2 at imaginary times and /3. The particles 
(yellow) represent the current state of the trace (same state at 
times and /3). Some kinetic (red) and ring-exchange (blue) 
terms are distributed in both space and time. The Green op- 
erator (green) reconnects the broken worldlines (gray). In the 
present case, the offset (purple) is n = 4. 



also over other algorithms. For example, in the SSE al- 
gorithm a single loop can update only a restricted region 
of space (Fig. [2]), and an additional procedure is neces- 
sary to update the time indices of the operators. In the 
CW algorithm, while a single update is able to affect the 
entire space, it is unable to update more than two time 
indices (Fig.[3l). 

The new global space-time update we propose over- 
comes those weaknesses by generating a completely new 
portion of the operator string. In addition the update 
is directed, that is to say, the width of time window 
[Ti]Tf] visited by the Green operator can be controlled 
via an optimization parameter. Also, the length of the 
operator string can fluctuate by any number with a sin- 
gle update. Therefore the auto-correlation time between 
different configurations is reduced, resulting in a better 
sampling (See section |VII|) . 



A. Global space-time update scheme 

An easy way to generate configurations of the extended 
partition function consists in starting from an initial con- 
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FIG. 2. (Color online) A typical configuration in the SSE rep- 
resentation with a loop update. Diagonal operators (blue) do 
not affect the worldlines (gray), while non-diagonal operators 
(red) allow to break the worldlines. In order to update the 
configuration, a loop (black dotted line) is constructed, the 
occupation numbers of the sites that are visited are increased 
or decreased, and the vertices are updated with diagonal or 
non-diagonal operators. The loop can jump vertically be- 
tween two vertices, or horizontally across the same vertex. 
Because horizontal jumps are not permitted between two dif- 
ferent vertices, a loop that opens in a given region (cyan, 
yellow, or pink) must close in the same region. As a result, 
only a restricted region of space can be updated with a single 
loop. An additional procedure is necessary in order to update 
the time indices of the vertices, and redistribute diagonal op- 
erators over the space to give the next loop a chance to update 
a new region of space. 



figuration where Q is the only operator present in the 
operator string (jlOp . with an arbitrary imaginary time 
index r and an arbitrary initial state, = \R)- Then, 
while propagating to the left (increasing time) or to the 
right (decreasing time), the Green operator can either 
drop a T operator behind (creation) or pick up a 7~ op- 
erator ahead (destruction). Each creation introduces a 
new projector I?/)) (-01, and selects a single term of T. 
Each destruction removes a projector a-nd a sin- 

gle term of T. More precisely, assuming a propagation 
of Q to the left, a creation corresponds to the following 
transition 

{L\g\R) ^ {L\g\^P){ip\f\R), (14) 

and a destruction corresponds to 

{Li\f\L){L\g\R) ^ {Li\g\R). (15) 

The idea of the global space-time update is to give to 
the Green operator a chance to perform several creations 
and destructions while propagating in the same direction. 




FIG. 3. (Color online) Typical initial and final configurations 
in the representation of the CW algorithm. Non-diagonal 
operators (red) appear with given space (black labels) and 
time indices. In order to update the configuration, a worm 
operator (blue) is propagated in time and changes the space 
indices of the operators that are visited. The worm operator 
can create an operator with a new time index only at the 
beginning of the propagation, and destroy an operator only 
at the end. As a result, a single update can affect only a 
majcimum of two time indices, and the length of the operator 
string can fluctuate only by +1,0,-1. 



and allow both space and time indices of the terms of the 
T operators to be updated. Thus, while propagating in 
the same direction, the Green operator can update the 
system over the whole space in a time window [ri^Tf] 
whose average width can be controlled. The scheme for 
this global space-time update is shown in Fig. HI A direc- 
tion of propagation to the left or to the right is chosen for 
the Green operator. Then an action, creation or destruc- 
tion of a T operator, is chosen. If creation is chosen, 
then a new imaginary time index r' is chosen for g in 
the time window [rij;rL], a state for a new projector is 
chosen, and a Tir') operator is created behind g. If de- 
struction is chosen, then a T operator is destroyed ahead 
of g. After the action has been performed, a decision to 
loop in the same direction or to stop is made. The algo- 
rithm continues to loop until it chooses to stop. Then a 
new time index is chosen in the new time window [r^; r^] 
for g, and the update is over. 

Note that when the Green operator crosses the peri- 
odic imaginary time boundaries, the state of the trace is 
updated. This update ensures the ergodicity of the algo- 
rithm, since any permitted term of the T operator can be 
inserted in the operator string at any imaginary time in- 
dex. Reciprocally, any term encountered can be removed. 
Thus any configuration is accessible from a given initial 
configuration in a finite number of iterations. 

The advantage of this global space-time update is that 
it is able to fully update a controllable portion of the 
operator string. Fig. [5] shows two configurations of the 
operator string associated with the Hamiltonian ^ that 
are accessible from each other with a single update. 



B. Detailed balance 

In order to generate operator strings with the (ex- 
tended) Boltzmann weight, detailed balance must be sat- 
isfied. For this purpose we consider a transition between 
an initial and a final configuration. 



5 



Create 



Choose New 
Time Index 
For New T 



Choose New 
State 



Create T 




can be factorized as 



Destroy 



Destroy T 



Choose New 
Time Index 
For G 




FIG. 4. (Color online) The global space-time update scheme. 
A T operator can be created with any space and time indices, 
and any T operator that is encountered can be destroyed. 
Sequences consisting of several creations and destructions can 
be performed while propagating in the same direction (see 
text for details). 
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FIG. 5. (Color online) An example of global space-time up- 
date for the SGF algorithm in the case of the Hamiltonian 
Kinetic (red) and ring-exchange (blue) operators can be cre- 
ated or destroyed. The labels on the operators correspond to 
a particular term (space index). The Green operator (green) 
is propagated to the left from Ti to r/ and can update both 
space and time indices of the kinetic and ring-exchange op- 
erators that are encountered. In addition, the length of the 
operator string can fluctuate by any number and the aver- 
age width of the time window [r^; Tf] is controllable (directed 
update). 



Defining the Boltzniann weight Pi (and P/) of the ini- 
tial (and final) configuration, and the probability Wi^f 
(and Wf^i) to make a transition from the initial to the 
final (and from the final to the initial) configuration, the 
detailed balance takes the form: 

P,W.,^f = PfWf^, (16) 

The propabilities Wi^f and Wf^i to make a transition 



S,^fA. 



(17) 

(18) 



where Si-^f (resp. Sf-^i) is the probability to propose 
the transition from the initial to the final (resp. from the 
final to the initial) configuration, and Ai^f (resp. Af^i) 
is the probabiUty to accept the proposed transition. In 
the following, we make use of the Metropolis solution 



Ai^f = min(l,gi^/), 
Af^i = min (l,q/_,i), 

where the acceptance factors and g/- 



If- 



Pi Si- 



PfSf- 



(19) 
(20) 

are given by 
(21) 
(22) 



C. Determination of probabilities 

The global space-time update involves several proba- 
bility functions that need to be determined. The choice 
of these functions is arbitrary, the only requirements are 
that ergodicity and detailed balance must be satisfied. 
Thus we have some freedom that allows us to make a 
choice that will be convenient and efficient. We consider 
the following probabilities: 

• P^^{a), with a =<—,—>, the probability to choose 
a propagation of the Green operator in the cr direc- 
tion, conditionned by the states and time indices 
with labels L and R. 

• P^^'{'\), the probability to choose a creation, con- 
ditionned by L, R, a. 

• P^^{t), the probability to choose a new time in- 
dex for a T operator or for the Green operator, 
conditionned by L, R. 

• P^^'li)), the probability to choose the state -0 for 
a new projector l?/;)^?/;!, conditionned by L, R, a. 

• P^^(O), the probability to loop, conditionned by 
L, R, a. 

In order to satisfy detailed balance, we must be able to 
evaluate the acceptance factor gj^i"2 '"" for any sequence 
consisting of n actions of type in the direction cr, where 
a; is either a creation (c) or a destruction (d). Our pur- 
pose is to make a suitable choice for the above proba- 
bilities, such that all acceptance factors associated with 
all possible sequences in any direction reduce to a single 
acceptance factor q. 

For this purpose, let us consider a sequence that con- 
sists of a propagation to the left with a single creation. 
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The corresponding acceptance factor is g^. In the fol- 
lowing, we use primed (non-primed) labels for the final 
(initial) configuration. The Boltzmann weight Pi of the 
initial configuration is proportional to 



P., oc {L\g{T)\R) 



-Vr) 



(23) 



The hidden matrix elements of the operator string are 
unchanged by the update .The Boltzmann weight of the 
final configuration is proportional to 

Pf oc {L'\g{T')\R'){R'\f{Tn/)\R[) 

oc(L'|^|i?')(i?'|r|i?'i)e^'(^^-^«)e"«''''«"^«i^(24) 

The probability Si^f to propose the transition from the 
initial configuration to the final configuration is the prob- 
ability to choose a propagation to the left, times the prob- 
ability to do a creation, times the probability to choose 
the time index t^/ for the new 7~, times the probabil- 
ity to choose the new state \R'), times the probability to 
stop the update, times the probability to choose the time 
index t' for Q: 



P^^\^)Pll'^{])P^^{TR,)PiL"iR') 



l-P 



L'R' 



(0))P^^^ (r') 



(25) 



The probability to propose the reverse update from the fi- 
nal configuration to the initial configuration is simply the 
probability to choose a propagation to the right, times 
the probability to do a destruction, times the probability 
to stop the update, times the probability to choose the 
time index r for Q: 



5/^, = P^«(^)(l-Pi;^(t)) 
X {l- P^^{0))P^^{t) 



(26) 



Putting everything together, and realizing that Vl' = Vl, 
and Vr' ~ Vr. the acceptance factor can be written as: 



9^ = 



{L\g\R'){R'\f\R) 
{L\g\R)pLR{R') 



^r{V^-Vn) pi'«'(r') P^«(TflO 

P^'^'(^)(i - Pi;'^'(t)) (1 - P^^(O)) 

pM(^)pLi?,(|)(i _ P^'«'(0)) 



(27) 



We notice that the initial and the final times of the Green 
operator, r and r', appear only in the second factor 
of ((27|) . This suggests that it is possible to make the 
acceptance factor independent of those times by using 
an exponential distribution for the time probability. 



P^^(r) 



if AF = ' ^ ^ 



1 

Ar 



where we have defined AV — Vl~ Vr and At = tl — th. 



In order to favor the states that are important, we can 
choose the new states with a probability that is propor- 
tional to the weight of the new matrix elements. Thus 
we use the following distributions: 



{L\g\^){ip\f\R) 

{L\gf\R) 
{L\f\^){^\g\R) 



(29) 



(30) 



Injecting p8|) and 
takes the form 



{L\Tg\R) 

in (|27|) . the acceptance factor 



(31) 



, _ {L\gf\R){l-P'fiO)) 

" (Plglp)pi^(^)p^^(t) 

P^'^'(^)(l - P^'^''(t))(e^"'^^' - 1) 
Ay'(l-P^'«'(0)) ' 

and is written as a quantity that depends on the initial 
configuration, times a quantity that depends on the final 
configuration. Note that the reverse update corresponds 
to a propagation to the right with a destruction, hence 
q'^ = l/q'^. As a result, the acceptance factor of a left 
propagation with a destruction, g^, is obtained by in- 
verting pip , switching the direction, and exchanging the 
primed labels with the non-primed ones: 

d _ AF(1-P^«(0)) 



Pi«(^)(l-pi«(t))(l-e- 
{L'\g\R')P^'^'{^)Pl;^\^) 
{L'\fg\R'){l-Pl^R'{0)) 



AtAV\ 



(32) 



For a uniform sampling, we can impose the acceptance 
factor of a left propagation and creation to be equal to the 
acceptance factor of a left propagation and destruction, 
q1_ = g^. This allows to determine the probability of 
creation 



^r(t) 



{L\gf\R) 1 
{L\g\R) ft'' 
{L\fg\R) 1 
{L\g\R) J 



LR ' 



where we have defined 

{L\gf\R) 
{L\g\R) 

{L\g\R) 



AV 



,-ArAy 



AV 



1 



oArAy ■ 



(33) 



(34) 



(35) 



(36) 



The acceptance factors for a creation or a destruction 
become: 



[I- Pt"iO))ft^ P^'-R'(^) 

~L'R' 



PLR{^) (l_pL'ff(Q))j 



(37) 
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Let us consider now a propagation to the left and a se- 
quence of two creations, with a corresponding acceptance 
factor gjf; . The probabihty of the initial configuration is 
given by (P^ . The probability of the final configuration 
is: 

Pf = (L'|c?(r')|i?')(^'|t(rHO|i?'i)(i?'i|r(r;^,)|i?^) 
= {L'\g\R'){R'\f\R[){R[\f\R'^) 



(38) 



The probability to propose a transition from the initial 
configuration to the final configuration is the product: 

X P^^'l (0)Pi:^'l (t)F^^'l {TR')Pt"'^ (i?') 

X (1-P^'^''(0))P^'^'(t') (39) 

In the same way, the probability to propose the reverse 
transition from the final configuration to the initial con- 
figuration takes the form: 

Sf^, = p^'^'(^)(i - Pi;'^'(t))Pi;'^kc3)(i - P'-'^Ht)) 

X (l-Fi;^(0))P^^(T) (40) 

Using our previous definitions and the fact that Vl' ~ Vl 
and Vr' = Vr, the acceptance factor takes the form: 



(l-P^^(O))/. 

pLR'i ^Q-^ f^'^ 



LR 



P 



L'R' 



(l-P^'«'(0))/i'«' 



(41) 



and is written as a product of quantities that depend 
on the initial, intermediate, and final configurations, re- 
spectively. One can see from the above expression that a 
suitable choice for P^^(O) allows to make the acceptance 
factor independent of the intermediate configuration. A 
possible solution is 



Pi'^(O) = amin I 1, 



LR 



LR 



(42) 



where a is the opposite direction of ct, and a is an op- 
timization parameter to be chosen in [0;1[. With this 
choice, the acceptance factor of any update becomes to- 
tally independent of the sequence of creations and de- 
structions, and reads 



P^'^'{d)Q^^{a) 
P^R{a)Q^'R'{ay 



with 



Q"^(-)=/."^(l-amin(l,/^^//,^^) 



(43) 



(44) 



Finally we can impose the acceptance factor of a propa- 
gation to the left to be equal to the acceptance factor of 
a propagation to the right, = q^. This is realized if 



^LR 



Qi«(a) + Q^«(a)' 



(45) 



and, defining Q^^ = Q^^(^) + (5^^(^), we are left 
with a single acceptance factor for any update: 



Q 



LR 



(46) 



Because the acceptance factor p6|) is written as a ratio of 
a quantity that depends only on the initial configuration 
and a quantity that depends only on the final configura- 
tion, an ultimate simplification can be done. Using (j46p 
and defining Qi ~ Q^^ and Q f = ^ , we can rewrite 
(PT|) as: 



PiQiSi- 



(47) 



The above equation can be interpreted as follows: Ac- 
cepting all transitions with a probability of 1 is equiva- 
lent to sampling the partition function with the pscudo- 
Boltzmann weight PQ instead of the true Boltzmann 
weight P. The quantum average of any operator A can 
be obtained with a simple renormalization: 



(A/Q) 



PQ 



(48) 



PQ 



By construction, the function Q never diverges nor van- 
ishes. Hence the renormalization is well defined in any 
case. Note that we have explicitly excluded the value 1 
from the allowed values for a. This prevents the probabil- 
ity of looping, P^^{0), from being systematically equal 
to the unity in diagonal configurations, otherwise no mea- 
surements would be possible. The advantage of accepting 
all transitions with a probability of 1 is that no CPU time 
is wasted with useless rejected updates, and there is no 
need to record the changes made in the operator string 
during the update. Also, accepting a "bad" transition 
from time to time may help the system to escape from a 
local minimum of the energy. 

By adjusting the value of a for the probability to loop, 
one can tune the "directionality" of the update, that is 
to say the average length of the sequence of creations 
and destructions of the update. Having long sequences 
of creations and destructions reduces the probability for 
an update to undo the changes made in the configuration 
with the previous update. This also increases the width 
of the time window [t^jT^^] of the operator string that 
is updated, and highly reduces the auto-correlation time 
(see section IVlH) . 

The choice of the function g{n) determines how the ex- 
tended space of configurations is sampled. In section ITVl 
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it is shown that it is necessary to generate diagonal con- 
figurations in order to make measurements. Thus g(n) 
must be a function that decreases with n sufficiently fast 
in order to have a chance to generate diagonal configura- 
tions. However, non-diagonal configurations are needed 
in order to update the system. The choice of g{n) must 
be done is such a way that all non-diagonal terms have a 
comparable probability to be introduced in the operator 
string. In practice, we find that the choice g{n) ~ 
where L is the number of lattice sites is a good choice for 
Hamiltonians for which the highest order of non-diagonal 
terms is 2, like the Hamiltonian ([T]). For our non-trivial 
example ([2]), a better choice is: 



1 



gin) 



n = 

1/L n = l,2,3,4 
l/L"-3 ^ > 5 



(49) 



D. Summary of probabilities 

The total weight of creation and destruction for a par- 
ticular direction of motion, a =— >, -s— , and for particular 
|l) and states, can be parametrized as: 



w- 



where is the weight of creation. 



^ _ {L\gf\R) 
{L\g\R) 

and W~ is the weight of destruction, 

'AtAV 



w- 



AV 



coth 



(50) 

(51) 
(52) 

(53) 
1 and 



where we introduced the symbol s^- with = 
= +1. 

In this notation the absolute creation probability is 



Equivalently the destruction probability is 

w- 



PA-) 



The probability to loop is 



P„ (O) = a min ( 1 



(54) 



(55) 



(56) 



The relative probability to pick a particular direction 
reads 



Destruction weight 



Creation weight 



Loop probability 



Direction weight 



Renormalization weight 



Time index weight 



Space index weight 



AV- [coth (^^) - s„\ 



W+ = 



w+ 



L gr R 



L\g]R 
i re Lr 



L g R 



P„ (O) = amm 1 



W-+W+ 



i(a) = (ty- + w/+)(i-p-(o)) 



-Q(^) + Q(^) 



P(r) 



P^^(^) = 



L g ^ T R 



L\gT\R 

L I r U ) ( i/. g 1 H 



L rg R 



TABLE I. Summary of the update probabilities. 



As shown in Fig. 51 the algorithm starts by picking a par- 
ticular direction with relative probability Q{(j). Then it 
decides if it will destroy or create an operator with rel- 
ative probabilities and W~ respectively. If creation 
is chosen the new time index is chosen within the inter- 
val [t^,t/,] with relative probability distribution e'^^^ 
and the space index with relative probability given by 
Eq. ([^^ and (|30p . At the end of the creation or destruc- 
tion the algorithm decides if it wants to continue, with 
absolute probability P„ (O), or stop and start over. All 
the update probabilities are summarized in table H] and 
shown graphically in Fig. [Sj 



l-P-^(O) 



Move Left 



Left 



Create Destroy I^ig^t 



Left 



Move 
Right 



Create 



Destroy 
Right 



l-P^(O) 



wt w+ 

FIG. 6. (color online) Graphical representation of the update 
probabilities. First the algorithm picks a direction of mo- 
tion with a relative probability represented by the gray tiles. 
Their total area is the Boltzmann weight. Once the direction 
is fixed it will pick a destruction or creation of an operator 
with relative probabilities represented by the colored tiles of 
the corresponding side. The height of those tiles represents 
the probability to keep moving in the same direction after a 
creation or destruction. 



g(a) = VF,(1-P^(0)). 



(57) 
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IV. 



MEASUREMENTS 



A. Quantities represented by diagonal operators 



By definition of the extended partition function ([6]), 
the states |L) and of the Green operator arc associ- 
ated with the extended Boltzmann weight: 



PiL,R) 



{L\g\R){R\e-^^\L) 



(58) 



This extended Boltzmann weight can be used to measure 
any operator A. Indeed, consider the expectation value 
of A: 



Tr Ae-^^ 



(59) 



By introducing a complete set of states, and using the 
fact that all diagonal matrix elements of Q are equal to 1, 
Eq. ([59|) can be rewritten as 



E {L\A\R){R\e-l''^\L) 



L,R 



L\A 



P{L,R) 



E 5l,bP{,L,R) 

LM 



(60) 



where 5l,r is the Kronecker delta. By performing an 
importance sampling (denoted S^^) over S samples of 
states |i) and with the distribution P{L,R), the 
expectation value reduces to 



E 



(A) = lim 



lim 





A 






g 





E ^L.R 

1 ^ {l\a\r) 
{L\G\R)" 



(61) 



where Nd is the number of diagonal configurations in the 
set of samples. 

If we make the choice of accepting all updates with a 
probability of 100%, it is necessary to perform the renor- 
malization (l48ll . The estimator for A becomes: 



E 



(A) ^ lim 



{l\a/q\r) 
{l\g\r) 



s^^ J2 {l\i/q\r) 



(62) 



qLR 



It is important to note that measurements can be done 
only at the end of a global space-time update, when the 
loop is over, that is to say when detailed balance is sat- 
isfied. 



For diagonal operators, only diagonal configurations 
|L) = have a non- vanishing contribution, and 

Eg. ([62]) reduces to: 



j:{^\A/Q\i') 



(A) = lim 



s^^ E(^|i/Q|^) 



(63) 



This is the case for the diagonal energy V (jIV C 2[) . the 
supcrfluid density (jIV C 31 and IIV C 4p . time-dependent 
density-density correlation functions pV C 5p . and the 
dynamical structure factor pV C 6p . 



B. Quantities represented by non-diagonal 
operators 

Non-diagonal operators can be measured "on the fly" 
with Eq. ([5^ , while exploring the extended space of con- 
figurations. The numerator has contributions from non- 
diagonal configurations |L) ^ while the denomina- 
tor is evaluated in diagonal configurations |L) = 
This is the case for Green functions and the momentum 
distribution function pV C 7\i . The non-diagonal energy 
T can also be measured by decomposing it as a sum 
of Green functions. However it is possible to measure 
it using diagonal configurations only pVC2p . which is 
simpler and more efficient. 



C. Special measurements 

1. Integrals over imaginary time 

Many quantities of interest are defined as an integral 
of the form 



1= /^a(r)/(r)dT, 
/o 



(64) 



where 0{t) = e'^^C'e"'^^, O being an operator and 
/(r) an arbitrary function. This is the case, for in- 
stance, for the improved estimator of the diagonal energy 
(sec IIV C 2| and the Fourier transform of the local den- 
sity f see IIV C~6)l . Therefore, it is necessary to understand 
how integrals like ([64)1 can be evaluated exactly within 
the SGF framework. 

For this purpose, consider a configuration C„ of the 
operator string with length n, imaginary time indices 
"J"! 1 ''"2 , ■ ■ • , TVi , and the convention that tq = and 
Tn+i = For any time index r £ [0;/3[, there exists 
a unique k such that Tk < t < Tk+i- This implies the 
identity 

n 

^e(rfe<r<rfe+i) = l Vre[0;/3[, (65) 

fc=0 
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where Q(arg) = 1 if arg is true, otherwise. The inte- 
gral ()64p in this particular configuration can be decom- 
posed as 

( f d{T)f{T)dT)^ ={ f da{T)f{T)dT)^ 

JO " Jo 

+ ( r dn{T)f{T)dT) , (66) 

Ja 



The non-diagonal energy T can be measured easily by 
averaging the length of the operator string. More gener- 
ally, the energy associated with any non-diagonal Hamil- 
tonian term can be measured by counting the number of 
occurences of that term in the operator string with diag- 
onal configurations. Indeed, consider the decomposition: 



(71) 



where Od and On are the diagonal and non-diagonal 
parts of O, respectively. For diagonal configurations, all 
worldlines are straight between two consecutive time in- 
dices, Tfc and Tfc+i. As a result, the diagonal part Od has 
a constant expectation value between and r^+i and, 
using (|65| . the integral of the diagonal part Od in the 
configuration C„ takes the form 

■^0 fc=0 

X fT{r)dT, (67) 

where ipk'^^ labels the state between the time indices Tk 
and Tfc+i. 

For non-diagonal configurations. On can have a contri- 
bution only between the time indices tl and r^, where 
the worldlines are discontinuous. We must consider the 
case Tji < Tl for which we have th < t < tl, and the 
case Tl < tr for which wc have either < t < /? or 
<T < TL- By defining the notation 



^ I Tr<T <TL if Tr < TL 

\ {tr<t < 13) or {0<T < tl) if tl < tr ' 

(68) 

the integral of the non-diagonal part On in the configu- 
ration Cn can be written 

/3 



( / On{r)f{T)dT), ={L\On\R) 



X / /(r)e-^>^e(r-^)dr.(69) 
Jo 



The expectation value of a particular term Tfe is given 
by: 



{%) = ^Tr %e-P^ 



1 d -P{v-Y.f,~i%) 

'T'y p i-^k 



(72) 



Rewriting the exponential in (|72|) as 



-/3 V- 



-/jTk{T)dT 

-T.— — - 



(73) 



where Uk is the number of occurences of the TX term in 
the operator string, the expectation value of Tk takes the 
simple form 



Tr ef^Tre° ^* 



/ E r,(r)<ir /rfc(r)<ir 
i^k \SL ' 



nk 



statistical weight of n^ 



13 



(nk). 



(74) 



In particular, the non-diagonal energy T is given by av- 
eraging the total length n of the operator string: 



1 



3. The superfluid density at finite temperature 



(75) 



2. The energy 

The diagonal energy V can be measured directly in 
diagonal configurations by using (|63p . However, a better 
estimator can be constructed by taking advantage of the 
invariance by imaginary time translation: 



1 



/3 



(V) = ^( I nr)dT) 



(70) 



The integral in (|70p can be evaluated using ((S7)) with 
dd = V and /(r) = 1. 



The superfluid density can be easily measured via the 
winding number. For a d-dimensional system, the wind- 
ing number is a vectorial operator, W, whose components 
measure the number of times that the worldlines cross 
the boundaries of the system in the primary directions 
of the lattice. Considering a configuration C„ of the op- 
erator string with length n and imaginary time indices 
"J"!; '''21 ■ ■ ■ J ''ni we define the associated pseudo-current in 
the C direction at time r as 



n 

fc=i 



(76) 
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where Vj, measures the discontinuity of the worldhnes in 
the ( direction introduced by the T operator acting at 
time Tfc . In our example ^ , a kinetic term acting in the 
C direction gives = ±1, while a ring-exchange term 
systematically gives T^j. = 0. By definition, the winding 
in the C direction is obtained by integrating the pseudo- 
current 

1 



n 



(77) 



where Lq is the number of sites in the C direction. Thus 
the winding can be easily measured in the configuration 
Cn by using Eg. ((77)) . For identical particles in a cubic 
lattice with L'^ sites, it has been shown [2^ that the 
dimensionless superfluid density is given by 



Ps 



{W^)L 



2-d 



2dtl3 



(78) 



where t = 2ma^ ' ^ mass of one particle, and a is 

the lattice constant. This way, ps is easily obtained from 



the winding IW"^). 



that the discrete frequencies ojp become continuous in the 
limit /3 — >■ oo, and the "numerical observation" that the 

I ~ I 2 

values of j^(a;p) for p > 1 scale linearly with /3 at finite 

I ~ 1 2 

but low temperature. As a result, the value of x(0) at 
zcro-tcmpcrature can be estimated at finite temperature 
by performing a linear extrapolation. 

Therefore we define the improved estimator for the 
zero-temperature winding number as: 



;^(2|jc(wi) 



(80) 



Since both ui and uj2 vanish in the limit /3 — >■ cxd. 



we have lim 



/3^oo Wl 



Wi^. It turns out that {W^'^) 
shows a quasi-linear dependence in /3 at low temperature. 
Thus, when injected into ([78)) . the dependence in /3 is 
canceled and the measured superfluid density becomes 
independent of the temperature. 

Figure [7] shows the finite-temperature superfluid den- 
sity as a function of /? and the improved estima- 
tor for the zero-temperature superfluid density for 
the Bose- Hubbard model ([1]). As /3 increases, the 
value given by the improved estimator converges to 
the zero-temperature limit much faster than the finite- 
temperature superfluid density. 



4- Improved estimator for the zero-temperature superfluid 
density 

As we have seen above, the superfluid density ps can 
be easily measured via the winding number W . How- 
ever Ps can show a strong dependence in temperature, 
which makes it difficult to estimate its zero-temperature 
value. Measuring the zero-temperature superfluid den- 
sity requires, in principle, to perform simulations with 
increasing values of the inverse temperature /?, which is 
computationally expensive, and then perform an extrap- 
olation to /? — )■ oo. We propose here an improved esti- 
mator that gives the zcro-tcmpcrature superfluid density 
at arbitrary large temperature, thus making simulations 
easier. This improved estimator has been proposed by 
Batrouni and Scalettar for the discrete time Worldline 
algorithm Q. The generalization to continuous time is 
straightforward using our continuous-time definition of 
the pseudo current (|76|) . The improved estimator is ac- 
tually for the winding number, and we determine the 
superfluid density using (j78p . 

To this end, we define the Fourier transform of the 
pseudo-current 



k=l 



(79) 



with Wp = 2|P. One notices that = jf(0)/Lf . The 
idea of the improved estimator is based on the fact 




Bose-Hubbard model 
L=20 U/t=14 N=10 



Fiiiiie temperature p 

Improved estimator for zero-temperatiire p 



6 8 
Pt 
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FIG. 7. (Color online) The finite-temperature superfiuid den- 
sity and the improved estimator of the zero-temperature su- 
perfluid density as functions of the inverse temperature /3, for 
the Bose-Hubbard model ((T|. As /? increases, the value given 
by the improved estimator converges to the zero-temperature 
limit much faster than the finite-temperature superfluid den- 
sity. 



5. Time- dependent density- density correlation function 

Quantities like {npi(T')hf:{TYj can be measured di- 
rectly in diagonal configurations, since a given configura- 
tion Cn fully determines the values of np>(T') and np^r). 
However, in order to reduce the statistical fiuctuations, 
we might want to take advantage of the invariancc by 
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imaginary time translation and perform an average over 
the imaginary time axis. Also, if the Hamiltonian is in- 
variant by space translation, the correlation function will 
depend only on r ' — r, and fluctuations can be reduced 
further by summing over the space. 

As a result we consider the correlation function 

= T-rHI^ (f'P+f'ir + r')n,,{r'))dr', (81) 
p, Jo 

which depends only on the translation f and time r be- 
tween the two correlated points. By introducing two 
identities ((65|) . we get 

^ n n 

P p' k=Oq=0 

(82) 

with 

Fk,q{T) = (r + Tfc+1 - Tg)Q{Tg - Tk+1 <T< Tg+l - Tfe+l) 
+ (rq+1 - Tq)e{T > Tq+l - Tfc+l) 
+ (t + Tk- Tq)Q{Tq - Tk < T < Tg+l - Tk) 
+ {Tq+1 - r,)e(T > Tq + l ~ Tfe ) . (83) 

Again, Eq. ((82|) can be directly evaluated for any given 
configuration Cn- However it should be pointed that eval- 
uating Eq. ([82|) for a single vector r and time r is a 
process that has a complexity. If one is interested in 
getting full information on density-density correlations, it 
must be evaluated for all possible vectors r and a number 
of time samples that is proportional to /3. Thus the total 
complexity of the measurement process becomes . 
We show in subsection IIV C 6] that it is actually possible 
to obtain full information on density-density correlations 
with a -I- complexity by evaluating correlations 
in the Fourier space, then performing an invert Fourier 
transform. 



6. The dynamical structure factor 

The dynamical structure factor is the space-time 
Fourier transform of the time-dependent density-density 
correlation function ((8T|) . 

S{k, iOp) = ^J2 f a~(T)e<''-^^-"'^) dT (84) 
^P p JO 

with Ljp = As discussed in subsection IIV C 5l eval- 
uating directly ((8T|) for all vectors f and time r has a 
complexity, thus evaluating ([84|) for all vectors k 
and frequencies LUp may appear to be very difficult. Ac- 
tually, this can be done with a + LjP complexity 
by working directly in the w-space. For this purpose, 
we consider the Fourier transform of the time-dependent 
local density operator tip{t), which can be directly eval- 
uated in a configuration C„ by using (|67|) with Od = np 



and /(r) = e"^""^, 

P k=0 

X . (85) 

Evaluating ([55]) for all vectors r and frequencies ujp is a 
process that has a complexity. Consider now the 
time Fourier transform of ([8T|) : 

1 

P Jo 

f' 

Knowing hp{ujp), we can obtain Cp{ujp) for all r* and LOp 
with an additional L'^j3 complexity. Finally, the dynam- 
ical structure factor can be obtained by performing a 
space Fourier transform, 

^(fc,c.p) = i^aK^p)e'^-, (87) 

r 

which again has a complexity. So the total com- 
plexity for measuring the dynamical structure factor is 

-I- In addition, the time-dependent density- 

density correlation function ((5T|) can be obtained from 

for all vectors r and an arbitrary set of time indices 
by performing an invert time Fourier transform, 

Cr{T) =Y,Cp{u;p)e'^-\ (88) 
p 

again with a total complexity + 

7. Green functions and the momentum distribution 
function 

Green functions can be measured directly by us- 
ing Ea. ((62|) . For example, the 4-point Green func- 
tion Gijki = a\a^-af.ai receives contributions from non- 
diagonal configurations for which the matrix element 
(^L\Gijki/Q\R) is non-zero. In the contributing configu- 
rations, the denominator ^L|fj|i?) is equal to (7(4), so the 
choice of the function g(n) must be suitable for having a 
good frequency of measurement and a value of g(4) that 
is not too small, in order to avoid strong fiuctuations. 
However 17(4) should not be too big, in order to have a 
good chance of generating diagonal configurations that 
are needed for the normalization of the measurements. 

The momentum distribution function n(k) = (ata-) 

^ k k' 

measures the average number of particles with momen- 
tum hk. For a d-dimcnsional lattice with L sites, the cre- 
ation and annihilation operators at and d- of a particle 
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with momentum k are defined as the Fourier transforms 
1 



at = —S^a^e'''-^'' 

p 



(89) 
(90) 



where aj, and a^ are the creation and annihilation op- 
erators of a particle on sites p and g, respectively. The 
momentum distribution function takes the form 



(91) 



and can be obtained from 2-point Green functions, or the 
so-called one-particle density matrix, ppq = (aJ^Cg). 



8. The specific heat 

The specific heat is defined as the rate of change of 
the energy E = i^W) with temperature T, = f^|y, 
keeping the volume (number of sites) constant. A 
simple way to approximate Cy consists in perform- 
ing a numerical symmetric derivative of the energy, 
C„ w {E{T + 5T)-E{T-5T))/25T. This method works 
well, but requires to perform two simulations at temper- 
atures T — dT and T+5T in addition to the simulation at 
the temperature of interest T, and several attempts are 
usually needed in order to determine the "best" value for 
ST. 

It is actually possible to evaluate exactly with a 
single simulation, via the fluctuation-response theorem. 



(92) 



In principle all terms in Eq. (|92p can be measured as de- 
scribed in this section. However this requires to build the 
list of all terms present in "H^, whose number scales as 
for Hamiltonians with finite-range interactions (L* oth- 
erwise) . In addition to being computationally expensive, 
the sum of these terms may suffer from strong fluctua- 
tions. We show below that it is actually possible to evalu- 
ate ("H^) globally by performing diagonal measurements 
only. 

Since Ti. commutes with its exponential e~'^^, we can 
write 



for any imaginary times and rj. Thus, by integrating 
over C and ry, we get 



1 



-Tr e 



-I3V 



niT')dT'Yefonr)d. 



(94) 



By substituting Ti = V — T and using the equality 



' |7=1 

(95) 

the speciflc heat can be measured as 



Cy = {n{n -!)) + {[ I V{T)dT))-2{n V{T)dT) 



{ / V(r)dr) - (n) 
. Jo 



(96) 



where n is the length of the operator string and the inte- 
grals are computed using ([67|) with Od = V and /(r) = 1. 



9. The Entropy and the thermal susceptibility 

The statistical entropy is given by5 = k (^In Z + /3 (H)) , 
where k is the Boltzmann constant. Because the actual 
value of the partition function Z is unknown in QMC 
simulations, a direct measurement of S is not possible. 
However S can be evaluated by performing a numerical 
integration over the temperature T of the speciflc heat 



SiT) 



T ^ 

^dT', 



(97) 



which requires a set of simulations at different tempera- 
tures. 

We propose here an alternative which consists in per- 
forming a set of simulations in the grand-canonical en- 
semble (see section |V| at different values of the chemical 
potential. For simplicity, we consider here a Hamiltonian 
Ti. that describes identical particles, for example Eq.([T]) 
or Eq.Q. The grand-canonical partition function takes 
the form 



Tr e 



(98) 



where M is the operator that measures the total number 
of particles, and is the chemical potential. Our thermo- 
dynamic control parameters are the temperature T , the 
volume V (number of sites L), and the chemical potential 
/i. We define the thermal susceptibility xth as the rate of 
change of the average number of particles N = ^A/") with 
temperature T: 



XthiT,V,ii) = — 



(99) 



By substituting N = ^Tr Me'^V^"^'^) in Eq. and 

assuming that [■H,A/'] = 0, we get an expression for the 
thermal susceptibility that can be directly measured in 
our simulations, 



Xth = 



(100) 
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as explained in IIV C 2] Considering the energy E = (H) 
and the associated differential dE ~ TdS — PdV + ^dN, 
where the pressure is defined as P = — ^ 1 5 at ' ^^'^ 
performing a Legendre transformation over the vari- 
ables S and iV, we can define the grand-canonical po- 
tential D, that depends only on our natural variables, 
n{T, V,ii) ^ E-TS- ^iN ^ -PV. Its differential takes 
the form 



dn 



-SdT - PdV - Ndn. 



We can then extract a useful Maxwell relation, 



dS_ 



V,T 



dN 
'df 



(101) 



(102) 



so the entropy can be easily obtained by integrating the 
thermal susceptibility over the chemical potential and 
keeping the temperature and the volume constant, 



■I Un 



(103) 



where /io is the critical value of the chemical potential 
below which the average number of particles N and the 
thermal susceptibility Xth are vanishing. Recently, this 
method has been applied successfully to Hamiltonians 
describing diagonal and off-diagonal confinement [2^ . 



V. SIMULATION OF THE 
GRAND-CANONICAL ENSEMBLE 

Consider a general Hamiltonian % that describes 5* 
species of particles, and let Ns be the number of parti- 
cles of a given species s in an initial state. If it is possible 
to define P charges Mp that are conserved by the Hamil- 
tonian, 



A/i = 7i,iiVi + 71,2^2 H + li.sNs 

■1^2 = 72,1^1 + 72,2^^2 H + l2.sNs 



(104) 



A/p = 7p,iA^i + 7p,2^^2 



where 7^^ are integers, then the Hamiltonian commutes 
with A/i, A2, ■ ■ ■ , A/p, and the canonical ensemble is de- 
fined as the set of all states that contain exactly the same 
number charges as the initial state. 

By nature the SGF algorithm samples the canonical 
ensemble, since all states are generated from an initial 
state by successive applications of T operators. Simulat- 
ing the canonical ensemble can be convenient, especially 
for systems with several species [isl ,16>] . However it is 
sometimes useful to work in the grand- canonical ensem- 
ble [2l|, that is to say in the ensemble of states that 
contain any number of particles, especially for magnetic 
systems. Thus, the best solution is to have an algorithm 
that can simulate both ensembles. We describe below a 



simple extension that allows the SGF algorithm to sim- 
ulate exactly the grand-canonical ensemble. 

The idea is to add a non-conservative part Hnc to the 
Hamiltonian, 



(105) 



where j runs over all lattice sites and s runs over all 
species, and A is an optimization parameter. This non- 
conservative part allows the number of particles to fluc- 
tuate, so simulating the Hamiltonian l-Lnc will 
sample the grand-canonical ensemble. In order to make 
measurements that correspond to the actual Hamiltonian 
"H, we can perform a restricted simulation of %' by ap- 
plying the following conditions: 

1. We allow at most one term of Tine at a time in the 
operator string. 

2. Wc make measurements only if the operator string 
contains no T-Lnc terms. 

The first condition is not required, but it allows to in- 
crease the probability to make a measurement. The sec- 
ond condition ensures that the actual Hamiltonian % is 
exactly simulated in the grand-canonical ensemble. In- 
deed, the extended partition function Z'{/3,t) associ- 
ated with Ji' is a sum over configurations that contain 
any number of T-Lnc terms. Ignoring configurations with 
Hnc terms is equivalent to performing a renormalization 
of Z'{/3,t), such that the resulting extended partition 
function Z(P,t) is only the sum of configurations that 
do not contain T-Lnc terms. As a result, configurations 
with no Tine terms are generated with the correct Boltz- 
mann weight, and the grand-canonical ensemble associ- 
ated with the Hamiltonian Ti is simulated exactly. 

The value of A can be adjusted in order to tune the 
proportion of unphysical configurations with Tine terms. 
A should be large enough to allow a fast decorrelation of 
the number of particles between different configurations. 
But using a value too big reduces the probability of hav- 
ing a physical configuration and making a measurement. 
In practice we find that a good choice is A = 1/L, where 
L is the number of lattice sites. 

Finally, the average number of particles (-/V^) of species 
s can be adjusted via a chemical potential fis by adding 
the term —fisNs to the Hamiltonian. 



VI. IMPLEMENTATION 

Any implementation of the SGF algorithm will have 
to efficiently evaluate all the update probabilities that 
appear m table m In particular it will have to effi- 
ciently trace the values of the matrix elements {L\QT \ R), 
{L\1'G \R)^ the left and right matrix elements (■(/'|T|i?) 
and ("^l^li?) and the potential energy difference ISV . For 
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a Hamiltonian with arbitrary non-diagonal terms, for ex- 
ample with long range hopping, the only general way of 
keeping track of these quantities is to calculate on the 
fly those matrix elements. Therefore the cpu time of 
each update will scale linearly with the number of non- 
diagonal terms in the Hamiltonian. This scaling severely 
limits the system sizes that are accessible with implemen- 
tations that rely on recalculating all the porbabilities at 
each update. 

Fortunately, a major simplification is possible for the 
most physically relevant Hamiltonians, for which the 
non-diagonal terms couple a constant number of sites, 
as in the models of Eq. ^ and Eq. ([2]). In both these 
Hamiltonians the number of non-diagonal terms increases 
linearly in size and each non-diagonal term involves only 
a few neighboring site indices. In this section we will dis- 
cuss an update algorithm where the cpu time per update 
scales logarithmically with the number of non-diagonal 
terms. 

Without loss of generality we will consider the case 
with an insertion of a non-diagonal term to the right of 
G, while the latter is moving to the left. The relative 
probability of the new state after the insertion is 



W^^^(i?') = (i|e|i?'> {R'\T\R) 



(106) 



where the new matrix element of the green operator, 
{L\Q\R'), is evaluated using Eq. ([S]). This expression 
depends only on the updated offset of the Green opera- 
tor, that is to say the updated number of broken lines. 



n-LR' 



E 



(107) 



where the sum is over all sites i, and nf and nf are the 
corresponding occupancies of states \L) and \R') respec- 
tively. In most physical models each term, 7fc, in T is 
a product of just a few creation/annihilation operators. 



Therefore the occupancies of \R') and \R) 



such as c 

will differ only over few indices. We can separate out 
those indices in Eq. (|107p and write the updated number 
of broken lines as 



riLB.' = riLR + SriLR % 



(108) 



where Sulu Tfe is the additional number of lines broken 
by Tk and is given by 



SriLR 



E 

i in tk 



-Sn,{Tk) 



(109) 

where nf and nf are the occupancies of the i*'' index 
of \L) and \R) respectively, 5ni{Tk) is the change in oc- 
cupancy caused by 7X on index i, and the summation 
extends only over the indices of the operator Tk- We 



will refer to Sulu 



Tk 



as the "offset" of Tfe. Eq. (fT08)) 



and p09p suggest that the total number of broken lines 



can be updated by summing only over the usually few 
indices of Tk rather than all the indices of the configura- 
tion. 

Furthermore we can rewrite Eq. p06p as 



W'^"{R')^g(nLR + Sn 



ILR 



Tk 



rpR 

^k 1 
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where = {R'\ 71 is the matrix element of the term. 
The offset is an integer that can take only few values, for 
example it can only take the values -2, 0, and +2 for the 
model ((H) and the values -6, -4, -2, 0, +2, +A, and -1-6 
for the model ([2]). It can therefore be used to categorize 
the non-diagonal terms. Let S (Sn) be the group of non- 
diagonal terms with the same offset, Sn. For all those 
terms in S (Sn) the matrix element of the Green operator, 
9 {n-LR. + Sn), is the same. 

The key idea is that the choice of an operator can be 
done in two steps, first we choose an offset and then a 
term from the corresponding group. The relative prob- 
ability for choosing an offset is the sum of the relative 
probabilities of all the terms with the same offset and 
can be written as 



W^^^ {6n) = g {nLR + 5n) 



tk in r 



rpR;: 

"5nMLR{tky 



(111) 



where the summation extends over all the non-diagonal 
terms and the Kronecker delta selects the terms with 
a particular offset 5n. After the offset a non-diagonal 
term is chosen from the group S (Sn) with relative prob- 
ability Tjp. The normalization of all the probabilities, 
{L\gf\R), reads 



(112) 



Sn 



So far we discussed about the insertion of a term Tk to 
the right. Removing a term from the right is equivalent 
to inserting T^ as far as the configuration is concerned. 
Similarly an operator can be added to the left which is 
the same as adding T^ to the left or removed from the 
left which is equivalent to removing Tk- 

When a term is added or removed, it will affect the off- 
sets and the matrix elements, of the terms that share 
an index with it. An optimization occurs if instead of 
updating the matrix elements and offsets of every term 
wc update only the few affected terms. In most physi- 
cal models the number of non-diagonal terms that share 
any particular index does not change with the system 
size. For example if all non-diagonal terms are of the 
form cjcj, such as in the Bosc-Hubbard model ([1]), then 
in d dimensions, there are exactly Ad terms with any par- 
ticular index. Therefore if such a term is inserted with 
indices i and j there are 8d— 2 terms that will be affected. 

This observation leads to the development of a fast- 
update algorithm for SGF. When a term is inserted or 
removed, a few non-diagonal terms will need to be re- 
moved from the group of their original offset and added 
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to the group of their final offset. The pairs of [TkjT^j 

for each group can be stored in a binary search tree, one 
for each offset Sn and direction of insertion relative to 
Q. The binary search tree supports searching, inserting 
and removing of terms that scales logarithmically in the 
number of terms. There are many potential implemen- 
tations of such a binary search tree. One possibility is 
that the leaves correspond to the terms and their weight 
is their matrix element if they are members of the group 
or zero if they are not. The nodes have weights which 
are equal to the sum of the weights of their children. To 
choose a term one starts from the root and selects one of 
the children randomly with their weight as the relative 
probability and proceeds similarly until a leaf is reached. 
When a term is inserted/removed, only the weights of 
the nodes that arc immediate ancestors need to be up- 
dated. Therefore the logarithmic scaling is guaranteed. 
A simple picture of a binary search tree with 4 terms is 
shown in Fig. [8l 




FIG. 8. A binary search tree for a model with 4 non-diagonal 
terms. There is one such tree for each group of terms with 
the same offset 5n. The leaves represent the terms and their 
weights, r^f . The nodes are the sum of the weights of all their 
children. In this example, the term 7i is not part of the group 
because it has zero weight. The colored boxes represent the 
nodes that will be accessed if term is chosen, or when it 
is inserted, removed from the tree or its weight changes. To 
choose an operator, one starts from the root and proceeds to 
the leaves, by making a binary choice at every level with rel- 
ative probabilities equal to the stored weights of each node. 
When the weight of a term is updated during an insertion, re- 
moval, or a change of occupancy, the change is communicated 
to all its parents all the way up to the root. 



VII. TEST OF THE ALGORITHM 



A. Exactness of the SGF algorithm 

Figure [9] shows a comparison between SGF results and 
an exact diagonalization for the non-trivial model ^ 
in the hard-core limit, for 2 x 2 hexagons (12 sites), 
fit ~ 4, at half-filling. The total energy and the su- 
perfluid density are measured as explained in subsec- 
tions |TVU2] and [iVClll respectively. The agreement for 
the energy and the superfluid density is perfect in both 
small and large K/t limits. 




FIG. 9. (Color online) Comparison between the SGF algo- 
rithm with global space-time update and an exact diagonal- 
ization for the model with six-site ring-exchange terms in the 
hard-core limit. The agreement for the energy and the super- 
fluid density is perfect in both small and large K/t limits. 

The specific heat, measured as explained in subsec- 
tion IfV C 81 is shown on Fig. [TO] as a function of temper- 
ature T for K/t = 5 at half-filling. Again, the agreement 
is perfect. 

Cv 




2.5 5 7.5 10 



In this section we illustrate the exactness of the SGF 
algorithm with global space-time update by making com- 
parisons between QMC results and exact diagonaliza- 
tions. We also show that the new global space-time up- 
date leads shorter auto-correlation times relative to the 
previous formulation of the directed update (l^ . 



FIG. 10. (Color online) The specific heat Cv as a function 
of temperature T. Comparison between the SGF algorithm 
with global space-time update and an exact diagonalization 
for the model with six-site ring-exchange terms in the hard- 
core limit. The two curves show a perfect agreement. 
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B. Auto-correlation time 

Let us consider a random variable X and a set of M 
successive samples X{1),X{2), • • • , X{M). We define the 
auto-correlation function of X at separation r (shift in 
the sample indices) by 



Cx{t) 



1 



M 



M-T 

■E 



{X{t + r)-{X)){Xit)-{X)), 

(113) 

where (X) is the average value of X. If the samples X{t) 
are independent from each other, then Cx(t) is vanishing 
for all T 7^ 0. If the samples are correlated, then there 
exists a critical Tc such that Cx(j < Tc) is non-zero. In 
the following, Tc is refered to as the auto- correlation time. 



1. Global space-time update versus previous formu 
directed update 



\lation of 



Figure [TT] shows the auto-correlation functions of the 
potential energy, the kinetic energy, and the winding 
number for the model ^ in the superfluid and the Mott 
phase. For the potential energy and the winding number, 
the auto-correlation functions arc similar for the previ- 
ous formulation of the directed update [l^ and the new 
global space-time update. However the auto-correlation 
time for the kinetic energy is highly reduced with the 
global space-time update, especially in the Mott phase. 



Poieniiii] energy - Old directed update 

Potential energy - Global apace-time update 

Kinetic energy - Old directed update 

Kinetic energy - Global space-time update 

Winding number - Old directed update 

Winding number - Global spuce-time 
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FIG. 11. (Color online) The auto-correlation functions of the 
potential energy, the kinetic energy, and the winding number 
for the Bose-Hubbard model ([l| in the superfluid (top panel) 
and the Mott (bottom panel) phases. The auto-correlation 
functions are normalized such that their maximum values are 
equal to 1. 



2. Effect of the directionality 

The parameter a allows to control the directionality of 
the global space-time update. Figure \\% shows the auto- 



correlation functions of the kinetic energy, the exchange 
energy, and the winding number for the model ([2]). As 
expected, increasing a reduces the auto-correlation time. 
However, the number of measurements decreases as a 
is getting closer to 1. In practice we find that a good 
compromise for having a small auto-correlation time and 
a good statistics is to choose a in [0.90; 0.99]. 
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FIG. 12. (Color online) The auto-correlation functions of the 
kinetic energy Ehin (top panel), the exchange energy E^xc 
(middle panel), and the winding number W (bottom panel) 
for the Bose-Hubbard model with six-site ring-exchange 
The system consists of 6 x 6 hexagons (108 sites) in the hard- 
core limit with K/t — 8 and fit — 6. The auto-correlation 
functions are normalized such that their maximum values are 
equal to 1. 



VIII. CONCLUSION 

We have presented the Stochastic Green Function 
(SGF) algorithm and showed that it is able to simulate 
any Hamiltonian that does not suffer front the so-called 
"sign problem" . We have proposed a new global space- 
time update scheme which, in addition to being directed, 
has the advantage of reducing the auto-correlation time 
of the samples of measured quantities. The SGF algo- 
rithm is the first quantum Monte Carlo (QMC) method 
that does not make any assumption on the form of the 
Hamiltonian. As a result, it can be directly applied "as 
is" to any Hainiltonian. We have presented an optimized 
implementation where each update scales logarithmically 
with the system size, and which allows access to larger 
systems. We have illustrated the capabilities of the SGF 
algorithm by applying it to a Hamiltonian that includes 
six-site coupling terms, which is challenging for other 
QMG methods. In addition, we have shown that the 
SGF algorithm can work in both canonical and grand- 
canonical ensembles. Finally, we have shown that various 
quantities of interest can be measured by the algorithm, 
such as ri-point Green functions, time-dependent corre- 
lation functions, the dynamical structure factor, and the 
specific heat. 
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